Critical Initialization of Wide and Deep Neural Networks through Partial Jacobians: General Theory and Applications to LayerNorm
Darshil Doshi, Tianyu He, Andrey Gromov
Deep neural networks are notorious for defying theoretical treatment. However, when the number of parameters in each layer tends to infinity, the network function is a Gaussian process (GP) and a quantitatively predictive description is possible. The Gaussian approximation allows one to formulate criteria for selecting hyperparameters, such as the variances of weights and biases, as well as the learning rate. These criteria rely on the notion of criticality defined for deep neural networks. In this work we describe a new way to diagnose this criticality, both theoretically and empirically. To that end, we introduce partial Jacobians of a network, defined as derivatives of preactivations in layer $l$ with respect to preactivations in layer $l_0 \leq l$.
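To make the definition concrete, below is a minimal JAX sketch (not the authors' code; the width, depth, variances, and all helper names such as `partial_jacobian_norm` are illustrative assumptions) that estimates the averaged partial Jacobian norm $\frac{1}{N}\|\partial h^L / \partial h^{l_0}\|_F^2$ for a randomly initialized tanh MLP.

```python
# Illustrative sketch: averaged partial-Jacobian norm of a tanh MLP in JAX.
import jax
import jax.numpy as jnp

WIDTH, DEPTH = 512, 10
SIGMA_W2, SIGMA_B2 = 1.0, 0.0  # weight/bias variances; (1, 0) is critical for tanh

def init_params(key):
    """Sample W ~ N(0, sigma_w^2 / N) entrywise and b ~ N(0, sigma_b^2)."""
    layers = []
    for k in jax.random.split(key, DEPTH):
        kw, kb = jax.random.split(k)
        W = jax.random.normal(kw, (WIDTH, WIDTH)) * jnp.sqrt(SIGMA_W2 / WIDTH)
        b = jax.random.normal(kb, (WIDTH,)) * jnp.sqrt(SIGMA_B2)
        layers.append((W, b))
    return layers

def propagate(h, params):
    """Preactivation recurrence h^{l+1} = W^{l+1} tanh(h^l) + b^{l+1}."""
    for W, b in params:
        h = W @ jnp.tanh(h) + b
    return h

def partial_jacobian_norm(key, l0=2):
    """(1/N) * ||d h^L / d h^{l0}||_F^2 for one random network and input."""
    kp, kx = jax.random.split(key)
    params = init_params(kp)
    h0 = jax.random.normal(kx, (WIDTH,))   # input, treated as h^0
    h_l0 = propagate(h0, params[:l0])      # preactivations at layer l0
    # Partial Jacobian J_ij = d h^L_i / d h^{l0}_j, computed by autodiff
    J = jax.jacobian(lambda h: propagate(h, params[l0:]))(h_l0)
    return jnp.sum(J ** 2) / WIDTH

print(partial_jacobian_norm(jax.random.PRNGKey(0)))
```

The diagnostic intuition, per the abstract, is that at a critical choice of $(\sigma_w^2, \sigma_b^2)$ this norm stays $O(1)$ as depth grows, while off-critical choices make it explode or vanish exponentially, which is what makes the partial Jacobian a practical empirical probe of criticality.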